Inapproximability of Maximal Strip Recovery: II
Abstract
In comparative genomics, the first step of sequence analysis is usually to decompose two or more genomes into syntenic blocks, which are segments of homologous chromosomes. For the reliable recovery of syntenic blocks, noise and ambiguities in the genomic maps need to be removed first. Maximal Strip Recovery (MSR) is an optimization problem proposed by Zheng, Zhu, and Sankoff for reliably recovering syntenic blocks from genomic maps in the midst of noise and ambiguities. Given d genomic maps as sequences of gene markers, the objective of MSR-d is to find d subsequences, one subsequence of each genomic map, such that the total length of syntenic blocks in these subsequences is maximized. For any constant d ≥ 2, a polynomial-time 2d-approximation for MSR-d was previously known. In this paper, we show that for any d ≥ 2, MSR-d is APX-hard, even for the most basic version of the problem in which all gene markers are distinct and appear in positive orientation in each genomic map. Moreover, we provide the first explicit lower bounds on approximating MSR-d for all d ≥ 2. In particular, we show that MSR-d is NP-hard to approximate within Ω(d/log d). In the other direction, we show that the previous 2d-approximation for MSR-d can be optimized into a polynomial-time algorithm even if d is not a constant but is part of the input. We then extend our inapproximability results to several related problems, including CMSR-d, δ-gap-MSR-d, and δ-gap-CMSR-d.
Similar resources
Inapproximability of Maximal Strip Recovery
C. Zheng, Q. Zhu, and D. Sankoff (2007): Given two genomic maps G1 and G2, find a subsequence G1′ of G1 and a subsequence G2′ of G2 such that the total length of strips in G1′ and G2′ is maximized. • A genomic map is a sequence of gene markers. • A gene marker appears in a genomic map in either positive or negative orientation. • A strip is a maximal string of at least two markers that appea...
Exact and approximation algorithms for the complementary maximal strip recovery problem
Given two genomic maps G1 and G2 each represented as a sequence of n gene markers, the maximal strip recovery (MSR) problem is to retain the maximum number of markers in both G1 and G2 such that the resultant subsequences, denoted as G1* and G2*, can be partitioned into the same set of maximal substrings of length greater than or equal to two. Such substrings can occur in the reversal and n...
An Improved Approximation Algorithm for the Complementary Maximal Strip Recovery Problem
Given two genomic maps G1 and G2 each represented as a sequence of n gene markers, the maximal strip recovery (MSR) problem is to retain the maximum number of markers in both G1 and G2 such that the resultant subsequences, denoted as G1* and G2*, can be partitioned into the same set of maximal strips, which are common substrings of length greater than or equal to two. The complementary maxi...
A Linear Kernel for the Complementary Maximal Strip Recovery Problem
In this paper, we compute the first linear kernel for the complementary problem of Maximal Strip Recovery (CMSR) — a well-known NP-complete problem in computational genomics. Let k be the parameter which represents the size of the solution. The core of the technique is to first obtain a tight 18k bound on the parameterized solution search space, which is done through a mixed global rules and lo...
Maximal Strip Recovery Problem with Gaps: Hardness and Approximation Algorithms
Given two comparative maps, that is two sequences of markers each representing a genome, the Maximal Strip Recovery problem (MSR) asks to extract a largest sequence of markers from each map such that the two extracted sequences are decomposable into non-intersecting strips (or synteny blocks). This aims at defining a robust set of synteny blocks between different species, which is a key to unde...